We review a statistical mechanics treatment of the stability of globular proteins based on a simple
model Hamiltonian taking into account protein self interactions and protein-water interactions. The
model contains both hot and cold folding transitions. In addition it predicts a critical point at a
given temperature and chemical potential of the surrounding water. The universality class of this
critical point is new.
Biologically relevant proteins are macromolecules [1] whose structures are determined by the evolutionary process.
The folded conformation of globular proteins is a state of matter peculiar in more than one respect. The density
is that of a condensed phase (solid or liquid), and the relative positions of the atoms are, on average, fixed. These are
the characteristics of the solid state. However, solids are either crystalline or amorphous, and proteins are neither:
the folded structure, while ordered in the sense that each molecule of a given species is folded in the same way, lacks
the translational symmetry of a crystal. Unlike other known solids, globular proteins are not really rigid, being
able to perform large conformational motions while retaining locally the same folded structure. Finally, these are
mesoscopic systems, consisting of a few thousand atoms.
Quantitatively, the peculiarities of this state of matter are perhaps best appreciated from thermodynamics. Delicate
calorimetric measurements on the folding transition of globular proteins reveal the following picture. The
transition is first order, at least in the case of single domain proteins. The stability of the folded state, i.e., the
difference in Gibbs potential $\Delta G$ between the unfolded and the folded state, is at most a fraction of $kT_{room}$ per
amino acid. This is referred to as "cooperativity". The Gibbs potential difference $\Delta G$, as a function of temperature,
is non-monotonic: it has a maximum around room temperature (where $\Delta G > 0$ and consequently the folded state is
stable), then crosses zero and becomes negative both for higher and lower temperatures. Correspondingly, the protein
unfolds not only at high, but also at low temperatures. The melting transition under cooling is referred to as "cold
unfolding" or "cold denaturation." For temperatures around the cold unfolding transition and below, the enthalpy
difference $\Delta H$ between the unfolded and the folded state is negative; this means that cold unfolding proceeds with
a release of heat (a negative latent heat), as is also observed experimentally; at the higher unfolding transition, on
the contrary, $\Delta H > 0$, which corresponds to the usual situation of a positive latent heat. There are two peaks in the
specific heat, corresponding to the two unfolding transitions, and a large gap $\Delta C$ in the specific heat between the
unfolded and the folded state. This gap is again peculiar to proteins: usually, for a melting transition $\Delta C$ is small.

It is believed (although not universally) that from the microscopic point of view, the main driving force for folding is
the hydrophobic effect; in the native state of globular proteins hydrophobic residues are generally found in the inside
of the molecule, where they are shielded from the water, while hydrophilic residues are typically on the surface.
Hydrogen bonds within the regular elements of secondary structure ($\alpha$ helices and $\beta$ sheets), while necessary for
the stability of the native state, can hardly be thought of as providing the positive $\Delta G$ of the folded structure,
since the unfolded structure would form just as many hydrogen bonds with the water. When the protein unfolds,
the hydrophobic residues of the interior are exposed; this accounts for most of the gap in the specific heat $\Delta C$,
according to the known effect that dissolving hydrophobic substances in water raises the heat capacity of the solution.
A recent discussion of hydrophobicity in protein folding may be found in Ref. [8].
As in other branches of physics, once the thermodynamics of a system is known it is desirable to develop a
corresponding statistical mechanics model. In the following, we describe a recently proposed model of this kind
that accounts for the strange thermodynamical behavior described above. Its starting point is the simple but
appealing "zipper model" [13], which was introduced to describe the helix-coil transition. In this model, the relevant
degrees of freedom (conformational angles) are modeled through binary variables. Each variable either matches
the ordered structure (helix) or is in a "coiled" state. A related parametrization for the 3-d folding transition has
been proposed by Zwanzig [14], describing it in terms of variables $\psi_i$, each of which is "true" (1) when there is a local
match with the correct ground state, or "false" (0) if there is no match. The term "local" is here defined through the
parametrization index $i$. A zipper scenario that deals with the initial pathway of protein folding has been proposed
by Dill et al. [15]. We can parametrize this model in the same way as Zwanzig by assigning the value one
to each of the binary variables $\psi_i$ describing closed contacts in the zipper. Built into the model is that opening and
closing of contacts occur in a particular order: they behave as the individual locks in a zipper. This ordering is
imposed through the constraints

$$\psi_i \ge \psi_{i+1} . \qquad (1)$$

The variables $\psi_i$ alone cannot describe the degrees of freedom that become liberated when a portion of the zipper
is open. The open part of the zipper ($\psi_i = 0$) may move freely, whereas no motion is possible in the part of the zipper
where the contacts are closed ($\psi_i = 1$). In order to take this effect into account, we introduce a second, independent
set of variables $\xi_i$. For simplicity, we also make these variables binary, taking the values $1$ or $-B$. We are now in the
position to propose a Hamiltonian for this zipper model,

$$H = -\sum_{i=1}^{N} \psi_i \xi_i , \qquad (2)$$

subject to the constraints (1).

We note that for any finite value of $B$, parts of the protein may unfold inside the already folded region, i.e. in the
parts of the zipper where $\psi_i = 1$. In order to prevent this, we assume $B$ to be sufficiently large compared to any other
energy scale in the system (in particular $kT$, where $T$ is the temperature), so that the $\xi_i$ variables never assume
the value $-B$ as long as $\psi_i = 1$.

In the following we use this Hamiltonian as a starting point for analyzing the hot and cold denaturation
transitions of proteins dissolved in water. It is awkward to work with the Hamiltonian (2) directly because
of the constraints (1). We therefore make a transformation to a different set of variables in which the constraints (1) are
implicitly taken into account. We define a set of binary, unconstrained variables $\varphi_i$ by the following relation:

$$\psi_i = \varphi_1 \varphi_2 \cdots \varphi_i . \qquad (3)$$

In particular, $\psi_1 = \varphi_1$. In the limit $B \to \infty$, the Hamiltonian (2) becomes

$$H = -\varphi_1 - \varphi_1\varphi_2 - \varphi_1\varphi_2\varphi_3 - \cdots - \varphi_1\varphi_2\cdots\varphi_N , \qquad (4)$$

where there are no additional constraints. The role of the variables $\xi_i$, which is to provide entropy to the
unfolded part ($\psi_i = 0$) of the zipper, is now played by the degeneracy introduced into the Hamiltonian in the
following way: when a particular $\varphi_j = 0$, the Hamiltonian (4) is degenerate with respect to the variables $\varphi_i$
with $i > j$.
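As a quick check of this change of variables, the sketch below enumerates all unconstrained configurations for a small chain, builds $\psi_i$ as running products as in Eq. (3), and counts how many configurations share each number $n$ of leading closed contacts. The function names and the choice $N = 4$ are illustrative assumptions, not part of the original treatment.

```python
from itertools import product

def psi_from_phi(phi):
    """Map unconstrained bits phi_1..phi_N to psi_i = phi_1 * ... * phi_i (Eq. 3)."""
    psi, running = [], 1
    for bit in phi:
        running *= bit
        psi.append(running)
    return psi

def energy(phi):
    """Hamiltonian (4): minus the number of leading closed contacts."""
    return -sum(psi_from_phi(phi))

N = 4  # illustrative chain length
degeneracy = {}
for phi in product((0, 1), repeat=N):
    psi = psi_from_phi(phi)
    # The zipper ordering psi_i >= psi_{i+1} of Eq. (1) holds automatically.
    assert all(a >= b for a, b in zip(psi, psi[1:]))
    n = -energy(phi)
    degeneracy[n] = degeneracy.get(n, 0) + 1

print(degeneracy)
```

For $N = 4$ the counts are 8, 4, 2, 1 for $n = 0, 1, 2, 3$ and 1 for $n = 4$, that is $2^{N-n-1}$ for $n < N$ and a single fully closed state.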

The interactions between protein and water may be taken into account by adding to (4) a coupling parametrized
through water variables $w_1, w_2, \dots, w_N$. Returning for a moment to the original variables $\psi_i$, we propose an
interaction $(1 - \psi_i) w_i$. The rationale behind this form is that when a contact is open ($\psi_i = 0$), the part of the
protein parametrized by $i$ is exposed to water and interacts with it, while if the contact is closed ($\psi_i = 1$), there is no access
to the water and the interaction is zero. Returning to the new variables $\varphi_i$, the resulting Hamiltonian is

$$H = -\mathcal{E}_0 (\varphi_1 + \varphi_1\varphi_2 + \varphi_1\varphi_2\varphi_3 + \cdots + \varphi_1\varphi_2\cdots\varphi_N) + [(1 - \varphi_1) w_1 + (1 - \varphi_1\varphi_2) w_2 + \cdots + (1 - \varphi_1\varphi_2\cdots\varphi_N) w_N] , \qquad (5)$$

where we have introduced a scale parameter $\mathcal{E}_0$ in order to vary the relative strength of the protein self interactions and
the protein-water interactions. In order to model hydrophobicity, we assume the $w_i$ variables take values $\mathcal{E}_{min} + s\Delta$,
$s = 0, 1, \dots, g-1$. Here, $\Delta$ is the spacing of the energy levels of the water-protein interactions (written $\Delta\mathcal{E}$ in the
formulas below). The equidistant energy levels reflect the experimentally observed approximately constant heat capacity at
intermediate temperatures, whereas the finite number of levels $g$ takes into account that protein-water interactions vanish
at high temperatures, in practice above 120 degrees Celsius.

The number of terms in the Hamiltonian (5), $N$, is the number of contacts in the zipper model. This number may
be equal to the number of amino acids, but is a priori unknown. It is important to realize that if one parametrizes
the folding with fewer steps $N$, each unit will be larger, and the energies and entropies per unit are correspondingly increased (inversely
proportional to $N$).

The calculation of the partition function is straightforward. We parametrize the states of the system by the number
$n$ of consecutive matches $\varphi_1 = 1, \varphi_2 = 1, \dots, \varphi_n = 1$, ending with $\varphi_{n+1} = 0$, and the values $\{s_{n+1}, \dots, s_N\}$, where
each $s_i \in \{0, 1, 2, \dots, g-1\}$, for the $(N - n)$ water variables $w_i$ coupled to the unfolded portion of the protein. The energy of
this state is

$$e(n, s_{n+1}, \dots, s_N) = -n\mathcal{E}_0 + \sum_{i=n+1}^{N} (\mathcal{E}_{min} + \Delta\mathcal{E}\, s_i) , \qquad (6)$$

where we have introduced the energy scale $\mathcal{E}_0$ for the protein variable in order to make the formulas dimensionally
more transparent (up to now we used $\mathcal{E}_0 = 1$). Denoting $\beta = 1/T$ as the reciprocal temperature, the partition function
is

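$$Z = \sum_{n=0}^{N-1} 2^{N-n-1} g^n \sum_{s_{n+1}=0}^{g-1} \cdots \sum_{s_N=0}^{g-1} \exp(-\beta e(n, s_{n+1}, \dots, s_N)) + g^N \exp(\beta \mathcal{E}_0 N) . \qquad (7)$$

In the above equation the factor $2^{N-n-1}$ is the degeneracy of the unfolded protein degrees of freedom and the factor
$g^n$ is the degeneracy of water which is not exposed to the inside of the protein. Factorizing the sums over $s_i$ into
partition functions $Z_w$ gives

$$Z = \frac{1}{2} (2 Z_w)^N \sum_{n=0}^{N-1} \left( \frac{g \exp(\beta \mathcal{E}_0)}{2 Z_w} \right)^n + (g \exp(\beta \mathcal{E}_0))^N , \qquad (8)$$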

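where the phase space for a water degree of freedom exposed to an unfolded protein degree of freedom is

$$Z_w = \sum_{s=0}^{g-1} \exp(-\beta(\mathcal{E}_{min} + s\Delta\mathcal{E})) = \frac{\exp(-\beta \mathcal{E}_{min}) - \exp(-\beta \mathcal{E}_{max})}{1 - \exp(-\beta \Delta\mathcal{E})} , \qquad (9)$$

where $\mathcal{E}_{max} = \mathcal{E}_{min} + g\Delta\mathcal{E}$. From Eq. (8) one sees directly that the state of the system is determined by the size of the
quantity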

$$r = \frac{g \exp(\beta \mathcal{E}_0)}{2 Z_w} = \exp(\beta \Delta f) . \qquad (10)$$

If $\Delta f > 0$ then the system will be in the folded state because the sum in Eq. (8) is dominated by the last term,
whereas for $\Delta f < 0$ the system will be unfolded.
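The sign criterion on $\Delta f$ can be explored numerically. The sketch below evaluates $\Delta f = T \ln r$ from Eqs. (9) and (10) in units $k_B = 1$, $\mathcal{E}_0 = 1$, and locates the two temperatures where $\Delta f$ changes sign by bisection. The values $\Delta\mathcal{E} = 0.02$ and $g = 350$ are those quoted for the figures; $\mathcal{E}_{min} = -2$, the bracketing intervals and all function names are assumptions made for illustration.

```python
import math

def delta_f(T, e_min, e0=1.0, spacing=0.02, g=350):
    """Delta f = T ln r, with r = g exp(e0/T) / (2 Z_w) and Z_w from Eq. (9)."""
    ln_zw = (-e_min / T
             + math.log1p(-math.exp(-g * spacing / T))   # ln(1 - exp(-g*spacing/T))
             - math.log(-math.expm1(-spacing / T)))      # ln(1 - exp(-spacing/T))
    return e0 + T * (math.log(g / 2.0) - ln_zw)

def bisect_root(func, lo, hi, tol=1e-10):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

E_MIN = -2.0  # hypothetical value giving a finite window of stability
t_cold = bisect_root(lambda T: delta_f(T, E_MIN), 0.05, 1.3)
t_hot = bisect_root(lambda T: delta_f(T, E_MIN), 1.3, 10.0)
print(f"cold unfolding near T = {t_cold:.3f}, hot unfolding near T = {t_hot:.3f}")
```

Between the two roots $\Delta f$ is positive (folded), outside them it is negative (unfolded), which is the qualitative picture of one cold and one hot transition.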

The sum in Eq. (8) can be readily performed and the total partition function is

$$Z = \frac{1}{2} (2 Z_w)^N \, \frac{1 - \left( g \exp(\beta \mathcal{E}_0) / (2 Z_w) \right)^N}{1 - g \exp(\beta \mathcal{E}_0) / (2 Z_w)} + (g \exp(\beta \mathcal{E}_0))^N . \qquad (11)$$
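A brute-force check can make the factorization concrete. The sketch below sums Eq. (7) state by state for a very small system and compares the result with the closed form of Eq. (11). The parameter values (three water levels, three contacts) and the function names are illustrative assumptions chosen only to keep the enumeration tiny.

```python
import math
from itertools import product

def z_water(beta, e_min, spacing, g):
    """Closed form (9) for the water partition function."""
    e_max = e_min + g * spacing
    return ((math.exp(-beta * e_min) - math.exp(-beta * e_max))
            / (1.0 - math.exp(-beta * spacing)))

def z_enumerated(beta, e0, e_min, spacing, g, N):
    """Eq. (7): explicit sum over n and over all exposed water levels."""
    total = g**N * math.exp(beta * e0 * N)  # fully folded term
    for n in range(N):
        for levels in product(range(g), repeat=N - n):
            energy = -n * e0 + sum(e_min + spacing * s for s in levels)
            total += 2 ** (N - n - 1) * g**n * math.exp(-beta * energy)
    return total

def z_closed(beta, e0, e_min, spacing, g, N):
    """Eq. (11): the geometric sum over n done analytically (requires r != 1)."""
    zw = z_water(beta, e_min, spacing, g)
    r = g * math.exp(beta * e0) / (2.0 * zw)
    unfolded = 0.5 * (2.0 * zw) ** N * (1.0 - r**N) / (1.0 - r)
    return unfolded + (g * math.exp(beta * e0)) ** N

params = dict(beta=0.8, e0=1.0, e_min=-2.0, spacing=0.3, g=3, N=3)
print(z_enumerated(**params), z_closed(**params))
```

The two numbers should agree to rounding error; the closed form is the one to use for realistic sizes, where the explicit enumeration grows as $g^N$.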

The free energy is $F = -T \ln Z$, the energy $E = -\partial \ln Z / \partial \beta$ and the heat capacity $C = dE/dT$. Because
there is no pressure in the model, the energy $E$ takes the place of the enthalpy $H = E + pV$ and the free energy
$F = E - TS$ takes the place of the Gibbs potential $G = H - TS$.
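These definitions translate directly into a numerical recipe. The sketch below evaluates $\ln Z$ from Eq. (8) in log space (to avoid overflow for $N = 100$) and obtains the heat capacity as $C = \beta^2\, \partial^2 \ln Z / \partial \beta^2$ by a central finite difference. Units are $k_B = 1$, $\mathcal{E}_0 = 1$; the step size, the value $\mathcal{E}_{min} = -2$ and the function names are assumptions for illustration.

```python
import math

def ln_partition(beta, e_min, N=100, e0=1.0, spacing=0.02, g=350):
    """ln Z from Eq. (8), summed term by term with the log-sum-exp trick."""
    ln_zw = (-beta * e_min
             + math.log1p(-math.exp(-beta * g * spacing))
             - math.log(-math.expm1(-beta * spacing)))
    a = math.log(2.0) + ln_zw       # ln(2 Z_w): one unfolded unit
    b = math.log(g) + beta * e0     # ln(g exp(beta e0)): one folded unit
    logs = [-math.log(2.0) + (N - n) * a + n * b for n in range(N)]
    logs.append(N * b)              # fully folded state
    top = max(logs)
    return top + math.log(sum(math.exp(x - top) for x in logs))

def heat_capacity(T, e_min, h=1e-4, **kw):
    """C = dE/dT = beta^2 d^2 ln Z / d beta^2, by central differences in beta."""
    beta = 1.0 / T
    second = (ln_partition(beta + h, e_min, **kw)
              - 2.0 * ln_partition(beta, e_min, **kw)
              + ln_partition(beta - h, e_min, **kw)) / h**2
    return beta**2 * second

c_unfolded = heat_capacity(0.3, e_min=-2.0)   # below the cold transition
c_folded = heat_capacity(1.3, e_min=-2.0)     # inside the folded window
print(c_unfolded, c_folded)
```

With these assumed parameters the unfolded state carries a heat capacity of roughly one per unit, while inside the folded window it is much smaller, which is the gap $\Delta C$ discussed in the text.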

In Fig. 1 we show the heat capacity for three different choices of $\mathcal{E}_{min}$, representing three different values of
the chemical potential, as we discuss later. The characteristic feature is that there are two peaks corresponding to
warm and cold unfolding, and a gap $\Delta C$ in the heat capacity between the unfolded and the folded form. At higher
temperatures, i.e., $T > g\Delta\mathcal{E}$, the gap goes to zero because the water becomes effectively degenerate again. In Fig.
2 we show the order parameter $\langle n \rangle$ as a function of temperature for three values of the chemical potential. The figure
indeed confirms that the protein is folded between the two transitions.
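The order parameter can be computed from the same weights that enter Eq. (8): a state with $n$ closed contacts has relative weight $r^n/2$ for $n < N$ and $r^N$ for $n = N$. The sketch below is a minimal implementation under the assumptions $k_B = 1$, $\mathcal{E}_0 = 1$, $\Delta\mathcal{E} = 0.02$, $g = 350$, $N = 100$ and a hypothetical $\mathcal{E}_{min} = -2$; the function name is illustrative.

```python
import math

def mean_folded(T, e_min, N=100, e0=1.0, spacing=0.02, g=350):
    """<n>: average number of leading closed contacts at temperature T."""
    ln_zw = (-e_min / T
             + math.log1p(-math.exp(-g * spacing / T))
             - math.log(-math.expm1(-spacing / T)))
    ln_r = math.log(g / 2.0) + e0 / T - ln_zw      # ln r from Eq. (10)
    logs = [n * ln_r - math.log(2.0) for n in range(N)] + [N * ln_r]
    top = max(logs)                                 # rescale to avoid overflow
    weights = [math.exp(x - top) for x in logs]
    return sum(n * w for n, w in enumerate(weights)) / sum(weights)

for T in (0.3, 1.3, 5.0):
    print(T, round(mean_folded(T, e_min=-2.0), 2))
```

For these assumed parameters $\langle n \rangle$ stays close to $N$ at intermediate temperature and drops to a few units at both low and high temperature.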
We now calculate explicitly the difference in the thermodynamic functions between the unfolded and the folded
state. We consider these quantities per degree of freedom. The thermodynamic functions associated with a folded
(f) protein variable are the energy $e_f = -\mathcal{E}_0$, the entropy $s_f = \ln(g)$ and the free energy $f_f = -\mathcal{E}_0 - T\ln(g)$. The
free energy associated with an unfolded (u) protein variable is given by the corresponding partition function of water
multiplied by the degeneracy factor of an unfolded part of the protein: $f_u = -T\ln(2 Z_w)$. The difference in free
energy between the unfolded and the folded state is accordingly

$$\Delta f = f_u - f_f = T \ln\left( \frac{g \exp(\beta \mathcal{E}_0)}{2 Z_w} \right) , \qquad (12)$$

which is the quantity we earlier identified as the one which decides whether the system cooperatively selects the folded
or the unfolded state. To clarify the physical content of this formula we rewrite it for small energy level spacings
$\Delta\mathcal{E} \ll T$:

$$\Delta f = \mathcal{E}_0 + \mathcal{E}_{min} + T \ln\left( \frac{g \Delta\mathcal{E}}{2T} \right) - T \ln\left( 1 - \exp(-(\mathcal{E}_{max} - \mathcal{E}_{min})/T) \right) . \qquad (13)$$

From this expression for the difference in free energy one easily obtains the corresponding differences in energy, entropy
and specific heat. In particular, we obtain a gap in the specific heat between the folded and unfolded state of a protein
degree of freedom, $\Delta c = (\Delta\mathcal{E}/T)^2 e^{\Delta\mathcal{E}/T} / (e^{\Delta\mathcal{E}/T} - 1)^2 \approx \exp(\Delta\mathcal{E}/T) \approx 1$ for temperatures $T \in [\Delta\mathcal{E}, \mathcal{E}_{min} + \mathcal{E}_{max}]$,
see Fig. 1.
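To see how good the small-spacing form is, the sketch below evaluates the exact expression (12), with $Z_w$ from Eq. (9), next to the approximation (13). Expanding $\ln[(1 - e^{-x})/x] = -x/2 + x^2/24 - \dots$ with $x = \Delta\mathcal{E}/T$ suggests that the two differ by about $\Delta\mathcal{E}/2$. The parameter values and function names are assumptions for illustration ($k_B = 1$).

```python
import math

def delta_f_exact(T, e0, e_min, spacing, g):
    """Eq. (12) with the closed-form Z_w of Eq. (9)."""
    e_max = e_min + g * spacing
    zw = ((math.exp(-e_min / T) - math.exp(-e_max / T))
          / (1.0 - math.exp(-spacing / T)))
    return T * math.log(g * math.exp(e0 / T) / (2.0 * zw))

def delta_f_small_spacing(T, e0, e_min, spacing, g):
    """Eq. (13): valid when the level spacing is much smaller than T."""
    e_max = e_min + g * spacing
    return (e0 + e_min + T * math.log(g * spacing / (2.0 * T))
            - T * math.log(1.0 - math.exp(-(e_max - e_min) / T)))

args = dict(T=1.0, e0=1.0, e_min=-2.0, spacing=0.02, g=350)
difference = delta_f_exact(**args) - delta_f_small_spacing(**args)
print(difference)  # expected to be close to -spacing / 2
```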

To simplify the discussion let us consider the limit of large $\mathcal{E}_{max}$ in (13). It is easily seen that $\Delta f$ has a maximum
at the temperature $T_m \approx g\Delta\mathcal{E}/2e$ (with $e$ the base of the natural logarithm). The corresponding value of $\Delta f$ is
$\Delta f(T_m) \approx (\mathcal{E}_{min} + \mathcal{E}_0) + g\Delta\mathcal{E}/2e$, so the
condition for the existence of a region of stability of the ordered structure ($\Delta f > 0$) is:

$$\frac{g \Delta\mathcal{E}}{2e} > -(\mathcal{E}_{min} + \mathcal{E}_0) . \qquad (14)$$

This is of course always satisfied if $(\mathcal{E}_{min} + \mathcal{E}_0) > 0$; however, the more interesting situation is $(\mathcal{E}_{min} + \mathcal{E}_0) < 0$, since
then $\Delta f < 0$ at sufficiently low temperature, i.e. the phenomenon of cold unfolding appears. Under these conditions
$\Delta E$ is also negative at sufficiently low temperature, which means that we have a negative latent heat for cold unfolding.
Coming back to the partition function, Eqs. (6) and (7), we may write:

$$e = -N\mathcal{E}_0 + (N - n)(\mathcal{E}_0 + \mathcal{E}_{min}) + \sum_{i=n+1}^{N} \Delta\mathcal{E}\, s_i = -N\mathcal{E}_0 + \sum_{i=n+1}^{N} [\Delta\mathcal{E}\, s_i + \mathcal{E}_0 + \mathcal{E}_{min}] \qquad (15)$$

and

$$Z = e^{\beta N \mathcal{E}_0} \sum_{n=0}^{N-1} 2^{N-n-1} g^n \sum_{\{s_i\}} e^{-\beta \sum_{i=n+1}^{N} (\mathcal{E}_i - \mu)} + g^N \exp(\beta \mathcal{E}_0 N) , \qquad (16)$$

where we have set $\mathcal{E}_i = \Delta\mathcal{E}\, s_i$, $\mu = -(\mathcal{E}_0 + \mathcal{E}_{min})$. From this expression for $Z$ we can identify $\mu$ with the chemical
potential of the water, or, to be more precise, the difference in chemical potential of the water when it is in contact
with the hydrophobic interior of the protein and when it is not. Therefore, $\mu > 0$ is the physically relevant situation.
Experimentally, $\mu$ can be changed by adding denaturants, changing pH, etc., which indeed alters the stability of the
ordered structure.

For an intermediate value of the chemical potential, $r$ (defined in Eq. (10)) just touches the line $r = 1$, that is
$dr/dT = 0$ when $r = 1$, corresponding to a merging of two first order transitions. This defines a critical point. Around
this point, $r$ varies quadratically in $T - T_c$ and linearly in $\mu - \mu_c$, as seen from expanding Eq. (10). In experiments
on protein folding this point is accessible by changing the pH value of the solution. In fact, Privalov's data at low
pH values indeed indicate that such a critical point exists. The scaling properties around this point thus open a
possibility to gain insight into the nature of the folding process, in particular whether the pathway scheme we suggest
can be falsified.
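Since $\Delta f = -\mu + (\text{a function of } T \text{ only})$, the two conditions $r = 1$ and $dr/dT = 0$ reduce to maximizing that function: $T_c$ is the position of the maximum and $\mu_c$ its height. The sketch below does this with a golden-section search, assuming $k_B = 1$, $\mathcal{E}_0 = 1$, $\Delta\mathcal{E} = 0.02$ and $g = 350$; the function names and search interval are illustrative, and the result should land near the values quoted in the text for these parameters.

```python
import math

def stability_margin(T, spacing=0.02, g=350):
    """Delta f + mu: the mu-independent part of Delta f from Eqs. (9), (10)."""
    ln_ladder = (math.log1p(-math.exp(-g * spacing / T))
                 - math.log(-math.expm1(-spacing / T)))
    return T * (math.log(g / 2.0) - ln_ladder)

def golden_max(func, lo, hi, tol=1e-9):
    """Golden-section search for the maximum of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if func(c) > func(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

t_c = golden_max(stability_margin, 0.5, 3.0)   # where d(Delta f)/dT = 0
mu_c = stability_margin(t_c)                   # where Delta f = 0 at T_c
print(f"T_c ~ {t_c:.4f}, mu_c ~ {mu_c:.4f}")
```

Because the maximum is flat, $\mu_c$ is determined much more sharply than $T_c$ by this kind of search.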

In Fig. 3 we show the heat capacity as a function of temperature for the chemical potential at the critical value $\mu = \mu_c$. For
the chosen values $\mathcal{E}_0 = 1$, level spacing $\Delta = 0.02$ and $g = 350$, the critical point is situated at $T_c = 1.33303\ldots$,
$\mu_c = 1.2838\ldots$. The critical point lies at a minimum of the heat capacity curve. This is at first sight surprising, since
usually the heat capacity has a pronounced increase at the critical point. The minimum reflects a partial ordering, as
envisioned in Fig. 4, where we show the degree of folding, counted by the number $n$ of folded variables $\varphi_i = 1$,
$i = 1, \dots, n$, from $i = 1$ until the first variable $i = n + 1$ which takes the value $\varphi_{n+1} = 0$. The average value $\langle n \rangle$
is $N/2$ at the critical point, reflecting that the system is on average half ordered at this point. Correspondingly the
heat capacity dips to a value in between the values of an unfolded and a completely folded state.
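The half-ordered value can be read off from the weights in Eq. (8): at $r = 1$ every state with $n < N$ has weight $1/2$ and the fully folded state has weight $1$. The short sketch below computes the resulting average exactly with rational arithmetic; the function name is an illustrative assumption.

```python
from fractions import Fraction

def mean_folded_at_r_one(N):
    """<n> at r = 1: weight 1/2 for each n < N and weight 1 for n = N."""
    weights = [Fraction(1, 2)] * N + [Fraction(1)]
    return sum(n * w for n, w in enumerate(weights)) / sum(weights)

N = 100
value = mean_folded_at_r_one(N)
print(value, float(value) / N)
```

The result equals $N(N+3)/(2(N+2))$, which approaches $N/2$ for large $N$.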

To characterize the functional form of the dip in the heat capacity, we investigate analytically
$C_{sing}(T) = C(T, \mu) - C(T, \mu_c)$ with $\mu \gg \mu_c$ for different values of the size $N$. For finite $N$ we may express the
singular part of the heat capacity in the form:

$$C_{sing} = |T_c - T|^{-\alpha} \, g((T_c - T) N^{1/\nu}) , \qquad (17)$$

where the scaling function $g(x)$ (not to be confused with the number of levels $g$) satisfies $g(x) \to$ const when $x \to \infty$
and $g(x) \propto x^{\alpha}$ when $x \to 0$. We find analytically $\alpha = \nu = 2$ from differentiating
the partition function (11). Fig. 3 demonstrates this finite size scaling. Similarly we show in Fig. 4 the behavior of the
order parameter $\langle n \rangle$ as a function of $T - T_c$ and $N$:

$$\langle n \rangle = |T - T_c|^{\beta} f((T - T_c) N^{1/\nu}) , \qquad (18)$$

with $f(x) \to$ const when $x \to \infty$ and $f(x) \propto x^{-\beta}$ when $x \to 0$, where the exponent $\beta = -2$ is also found analytically. It
may be surprising that $\beta$ is negative, but this reflects in part the unusual use of an extensive (in $N$) order parameter,
and in part that for $\mu = \mu_c$ the order parameter only obtains a non-zero value at $T = T_c$ when $N \to \infty$.

Likewise, we find that the susceptibility $\chi = d\langle n \rangle / d\mu$ scales as $|T - T_c|^{-\gamma}$ where $\gamma = 4$, and that
$\langle n \rangle \propto (\mu - \mu_c)^{1/\delta}$ for $\mu > \mu_c$ where $\delta = -1$. Thus the usual exponent relations,
$\alpha + 2\beta + \gamma = 2$, $\alpha + \beta(\delta + 1) = 2$, and $\gamma(\delta + 1) = (2 - \alpha)(\delta - 1)$,
are fulfilled [17]. The hyperscaling relation $d\nu = 2 - \alpha$, where $d$ is the dimensionality of the system, is not
fulfilled; however, this relation has no meaning here, as there are no spatial degrees of freedom.
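The arithmetic behind these relations is easy to verify. The sketch below simply plugs the quoted exponents into the three relations; the variable names are illustrative.

```python
# Exponents quoted in the text for the critical point of the zipper model.
alpha, beta, gamma, delta = 2, -2, 4, -1

relation_1 = alpha + 2 * beta + gamma       # should equal 2
relation_2 = alpha + beta * (delta + 1)     # should equal 2
relation_3_lhs = gamma * (delta + 1)        # left side of the third relation
relation_3_rhs = (2 - alpha) * (delta - 1)  # right side of the third relation

print(relation_1, relation_2, relation_3_lhs, relation_3_rhs)
```

All three hold, the third one trivially since both sides vanish for $\alpha = 2$ and $\delta = -1$.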

In terms of experiments on proteins, the relevant scaling behaviour is how the degree of folding (order parameter)
and the heat capacity behave as functions of temperature when one changes the chemical potential away from its critical
value. The qualitative prediction is that the width of the singular part of the heat capacity has a minimum at the
critical value $\mu = \mu_c$. The broadening of the heat capacity is

$$C_{sing} (T - T_c)^2 = h\left( \frac{T - T_c}{\Delta\mu^{1/2}} \right) \quad \text{for } \mu > \mu_c , \qquad (19)$$

where $h(x) \propto x^{-2}$ for $x \to \infty$ and $h(x) \to$ const for $x \to 0$, and where $\Delta\mu = \max(\mu - \mu_c, \Delta\mu_{min})$ with $\Delta\mu_{min} \propto 1/N$
takes into account the finite size sensitivity of the scaling. We show in Fig. 5 an example of such a data collapse.
These predictions are experimentally accessible through the use of standard calorimetric techniques, where one should
seek to obtain a data collapse above the critical point, i.e. the point of minimal width. The heat capacity below the
critical $\mu$ is complicated by the merging of two first order transitions. However, the distance in $T$ between these
transitions grows as $\Delta\mu^{1/2}$.
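The square-root separation of the two transitions follows from the quadratic dependence of $\Delta f$ on $T - T_c$ near the critical point, and can be checked numerically. The sketch below locates the critical point, then solves $\Delta f = 0$ on both sides of $T_c$ for two values of $\mu_c - \mu$ differing by a factor of four; the gap between the transitions should roughly double. Units are $k_B = 1$ with $\Delta\mathcal{E} = 0.02$ and $g = 350$; helper names, brackets and offsets are assumptions for illustration.

```python
import math

def stability_margin(T, spacing=0.02, g=350):
    """Delta f + mu as a function of T (Eqs. 9, 10)."""
    ln_ladder = (math.log1p(-math.exp(-g * spacing / T))
                 - math.log(-math.expm1(-spacing / T)))
    return T * (math.log(g / 2.0) - ln_ladder)

def golden_max(func, lo, hi, tol=1e-9):
    """Golden-section search for the maximum of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - inv_phi * (hi - lo), lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if func(c) > func(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

def bisect_root(func, lo, hi, tol=1e-12):
    """Plain bisection on a bracket with a single sign change."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

T_C = golden_max(stability_margin, 0.5, 3.0)
MU_C = stability_margin(T_C)

def transition_gap(d_mu):
    """T_hot - T_cold for mu = mu_c - d_mu (below the critical value)."""
    delta_f = lambda T: stability_margin(T) - (MU_C - d_mu)
    return bisect_root(delta_f, T_C, 5.0) - bisect_root(delta_f, 0.3, T_C)

ratio = transition_gap(4e-3) / transition_gap(1e-3)
print(ratio)  # expected to be close to 2 for square-root scaling
```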

Likewise, we expect the degree of folding $\langle n \rangle$ to show a data collapse of the form

$$\langle n \rangle (T - T_c)^2 = k\left( \frac{T - T_c}{\Delta\mu^{1/2}} \right) \quad \text{for } \mu > \mu_c , \qquad (20)$$

where $k(x)$ behaves asymptotically as $h$. We show this in Fig. 6. This quantity can be observed experimentally
through fluorescence measurements.
FIG. 1. Heat capacity, $C$, as a function of $T$ for three values of the chemical potential $\mu$. Here $g = 350$, $\Delta = 0.02$ and
$N = 100$. The value $N = 100$ has been chosen so as to be close to realistic values for this parameter.

FIG. 2. Degree of folding, $\langle n \rangle$, as a function of $T$ for three values of the chemical potential $\mu$. The parameters are chosen
as in Fig. 1.

FIG. 3. Finite size scaling of the heat capacity for $\mu = \mu_c$, $g = 350$ and $\Delta = 0.02$. Here $\alpha = 2$ and $\nu = 2$.

FIG. 4. a) Finite size scaling of folding, $\langle n \rangle$, for $\mu = \mu_c$, $g = 350$ and $\Delta = 0.02$. Here $\beta = -2$.

FIG. 5. $C_{sing} (T - T_c)^2$ vs. $(T - T_c)/\Delta\mu^{1/2}$. We have chosen $N = 100$, $g = 350$ and $\Delta = 0.02$. Note the good quality of the
data collapse in spite of the smallness of the system.

FIG. 6. $\langle n \rangle (T - T_c)^2$ vs. $(T - T_c)/\Delta\mu^{1/2}$. We have chosen $N = 100$, $g = 350$ and $\Delta = 0.02$. Note the good quality of the
data collapse in spite of the smallness of the system.